Morphological Analysis of Sahidic Coptic for Automatic Glossing

نویسندگان

  • Daniel Smith
  • Mans Hulden
چکیده

We report on the implementation of a morphological analyzer for the Sahidic dialect of Coptic, a now extinct Afro-Asiatic language. The system is developed in the finite-state paradigm. The main purpose of the project is provide a method by which scholars and linguists can semi-automatically gloss extant texts written in Sahidic. Since a complete lexicon containing all attested forms in different manuscripts requires significant expertise in Coptic spanning almost 1,000 years, we have equipped the analyzer with a core lexicon and extended it with a ‘guesser’ ability to capture out-of-vocabulary items in any inflection. We also suggest an ASCII transliteration for the language. A brief evaluation is provided.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Computational Methods for Coptic: Developing and Using Part-of-Speech Tagging for Digital Scholarship in the Humanities

This paper motivates and details the first implementation of a freely available part of speech tag set and tagger for Coptic. Coptic is the last phase of the Egyptian language family and a descendent of the hieroglyphs of ancient Egypt. Unlike classical Greek and Latin, few resources for digital and computational work have existed for ancient Egyptian language and literature until now. We evalu...

متن کامل

Computational Methods for Coptic

This paper motivates and details the first implementation of a freely available part of speech tag set and tagger for Coptic. Coptic is the last phase of the Egyptian language family and a descendant of the hieroglyphs of ancient Egypt. Unlike classical Greek and Latin, few resources for digital and computational work have existed for ancient Egyptian language and literature until now. We evalu...

متن کامل

Second Position Clitics and Monadic Second-Order Transduction

The simultaneously phonological and syntactic grammar of second position clitics is an instance of the broader problem of applying constraints across multiple levels of linguistic analysis. Syntax frameworks extended with simple tree transductions can make efficient use of these necessary additional forms of structure. An analysis of Sahidic Coptic second position clitics in a context-free gram...

متن کامل

Automatic interlinear glossing as two-level sequence classification

Interlinear glossing is a type of annotation of morphosyntactic categories and crosslinguistic lexical correspondences that allows linguists to analyse sentences in languages that they do not necessarily speak. Automatising this annotation is necessary in order to provide glossed corpora big enough to be used for quantitative studies. In this paper, we present experiments on the automatic gloss...

متن کامل

A Fault Diagnosis Method for Automaton based on Morphological Component Analysis and Ensemble Empirical Mode Decomposition

In the fault diagnosis of automaton, the vibration signal presents non-stationary and non-periodic, which make it difficult to extract the fault features. To solve this problem, an automaton fault diagnosis method based on morphological component analysis (MCA) and ensemble empirical mode decomposition (EEMD) was proposed. Based on the advantages of the morphological component analysis method i...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016